The application of brute force logistic regression to corporate credit scoring models: Evidence from Serbian financial statements
Abstract
In this paper, a brute force logistic regression (LR) modeling approach is proposed and used to develop a predictive credit scoring model for corporate entities. The modeling is based on 5 years of data from end-of-year financial statements of Serbian corporate entities, as well as default event data. To the best of our knowledge, no relevant research on the predictive power of financial ratios derived from Serbian financial statements has been published so far. This is also the first paper to generate 350 financial ratios as independent variables for default prediction across 7590 corporate entities. Many of the derived financial ratios are new and have not been discussed in the literature before. The weight of evidence (WOE) method has been applied to transform and prepare the financial ratios for the brute force LR fitting simulations. A clustering method has been used to reduce the long list of variables and to remove highly correlated financial ratios from the partitioned training and validation datasets. The clustering results revealed that the number of variables can be reduced to a short list of 24 financial ratios, which are then analyzed in terms of their power to predict default events. In this paper we propose the most predictive financial ratios from the financial statements of Serbian corporate entities. The obtained short list of financial ratios has been used as the main input for the brute force LR model simulations. According to the literature, the common practice for selecting variables for the final model is to run stepwise, forward or backward LR. In this research, however, the brute force LR simulations fit every possible model comprising 5–14 independent variables drawn from the short list of 24 financial ratios. The total number of simulated LR models is around 14 million. Each model has been fitted through extensive and time-consuming brute force LR simulations using SAS code written by the authors.
A total of 342,016 simulated models (“well-founded” models) satisfied the established credit scoring model validity conditions. The well-founded models have been ranked according to GINI performance on the validation dataset. After ranking, the model with the highest predictive power, consisting of 8 financial ratios, has been selected and analyzed in terms of the receiver operating characteristic (ROC) curve, GINI, AIC, SC, LR fitting statistics and correlation coefficients. The financial ratio constituents of that model have been discussed and benchmarked with several models from rele-
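The size of the search space described in the abstract can be checked with a few lines of Python. The sketch below is not the authors' SAS code. It only counts and enumerates the variable subsets of size 5–14 drawn from 24 ratios, and it computes a Gini coefficient under the common credit scoring convention GINI = 2·AUC − 1, which is an assumption here because the abstract does not define the measure. The function names `count_models`, `candidate_models`, `auc` and `gini` are hypothetical.

```python
from itertools import combinations
from math import comb

N_RATIOS = 24               # short list size stated in the abstract
MIN_VARS, MAX_VARS = 5, 14  # model sizes stated in the abstract


def count_models(n=N_RATIOS, k_min=MIN_VARS, k_max=MAX_VARS):
    """Count the distinct variable subsets with k_min..k_max members."""
    return sum(comb(n, k) for k in range(k_min, k_max + 1))


def candidate_models(ratios, k_min=MIN_VARS, k_max=MAX_VARS):
    """Lazily yield every subset of `ratios` with k_min..k_max members."""
    ratios = list(ratios)
    for k in range(k_min, k_max + 1):
        yield from combinations(ratios, k)


def auc(scores, defaults):
    """Chance that a random defaulter outscores a random non-defaulter.

    Ties count as one half. `defaults` holds 1 for default, 0 otherwise.
    """
    pos = [s for s, d in zip(scores, defaults) if d == 1]
    neg = [s for s, d in zip(scores, defaults) if d == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(scores, defaults):
    """Gini coefficient, assuming the convention GINI = 2 * AUC - 1."""
    return 2 * auc(scores, defaults) - 1


if __name__ == "__main__":
    print(count_models())  # 14185135, i.e. "around 14 million"
    print(gini([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))  # 0.5
```

In a full brute force run, each subset yielded by `candidate_models` would be fitted as a separate LR model, filtered by the validity conditions, and ranked by `gini` on the validation dataset. The fitting step is omitted because the abstract does not specify the validity conditions.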
Similar resources
Matrix Sequential Hybrid Credit Scorecard Based on Logistic Regression and Clustering
The Basel II Accord pointed out benefits of credit risk management through internal models to estimate Probability of Default (PD). Banks use default predictions to estimate the loan applicants’ PD. However, in practice, PD is not useful and banks applied credit scorecards for their decision making process. Also the competitive pressures in lending industry forced banks to use profit scorecards...
Application of Genetic Algorithm in Development of Bankruptcy Predication Theory Case Study: Companies Listed on Tehran Stock Exchange
The bankruptcy prediction models have long been proposed as a key subject in finance. The present study, therefore, makes an effort to examine the corporate bankruptcy prediction through employment of the genetic algorithm model. Furthermore, it attempts to evaluate the strategies to overcome the drawbacks of ordinary methods for bankruptcy prediction through application of genetic algorithms. Thesa...
Enhancing credit scoring model performance by a hybrid scoring matrix
Competition of the consumer credit market in Taiwan has become severe recently. Therefore, most financial institutions actively develop credit scoring models based on assessments of the credit approval of new customers and the credit risk management of existing customers. This study uses a genetic algorithm for feature selection and decision trees for customer segmentation. Moreover, it utilize...
Credit Scoring Models for a Tunisian Microfinance Institution: Comparison between Artificial Neural Network and Logistic Regression
This paper compares, for a microfinance institution, the performance of two individual classification models: Logistic Regression (Logit) and Multi-Layer Perceptron Neural Network (MLP), to evaluate the credit risk problem and discriminate good creditors from bad ones. Credit scoring systems are currently in common use by numerous financial institutions worldwide. However, credit scoring using ...
The Comparison of Credit Risk between Artificial Neural Network and Logistic Regression Models in Tose-Taavon Bank in Guilan
One of the most important issues always facing banks and financial institutes is the issue of credit risk or the possibility of failure in the fulfillment of obligations by applicants who are receiving credit facilities. The considerable number of banks’ delayed loan payments all around the world shows the importance of this issue and the necessary consideration of this topic. Accordingly...
Journal title:
- Expert Syst. Appl.
Volume 40
Publication date: 2013